skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Vaidya, Jaideep"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available July 1, 2026
  2. Free, publicly-accessible full text available May 1, 2026
  3. Free, publicly-accessible full text available May 1, 2026
  4. Nikolski, Macha (Ed.)
    Abstract MotivationGenome-wide association studies (GWAS) benefit from the increasing availability of genomic data and cross-institution collaborations. However, sharing data across institutional boundaries jeopardizes medical data confidentiality and patient privacy. While modern cryptographic techniques provide formal secure guarantees, the substantial communication and computational overheads hinder the practical application of large-scale collaborative GWAS. ResultsThis work introduces an efficient framework for conducting collaborative GWAS on distributed datasets, maintaining data privacy without compromising the accuracy of the results. We propose a novel two-step strategy aimed at reducing communication and computational overheads, and we employ iterative and sampling techniques to ensure accurate results. We instantiate our approach using logistic regression, a commonly used statistical method for identifying associations between genetic markers and the phenotype of interest. We evaluate our proposed methods using two real genomic datasets and demonstrate their robustness in the presence of between-study heterogeneity and skewed phenotype distributions using a variety of experimental settings. The empirical results show the efficiency and applicability of the proposed method and the promise for its application for large-scale collaborative GWAS. Availability and implementationThe source code and data are available at https://github.com/amioamo/TDS. 
    more » « less
  5. Abstract The rapid improvements in genomic sequencing technology have led to the proliferation of locally collected genomic datasets. Given the sensitivity of genomic data, it is crucial to conduct collaborative studies while preserving the privacy of the individuals. However, before starting any collaborative research effort, the quality of the data needs to be assessed. One of the essential steps of the quality control process is population stratification: identifying the presence of genetic difference in individuals due to subpopulations. One of the common methods used to group genomes of individuals based on ancestry is principal component analysis (PCA). In this article, we propose a privacy-preserving framework which utilizes PCA to assign individuals to populations across multiple collaborators as part of the population stratification step. In our proposed client-server-based scheme, we initially let the server train a global PCA model on a publicly available genomic dataset which contains individuals from multiple populations. The global PCA model is later used to reduce the dimensionality of the local data by each collaborator (client). After adding noise to achieve local differential privacy (LDP), the collaborators send metadata (in the form of their local PCA outputs) about their research datasets to the server, which then aligns the local PCA results to identify the genetic differences among collaborators’ datasets. Our results on real genomic data show that the proposed framework can perform population stratification analysis with high accuracy while preserving the privacy of the research participants. 
    more » « less
  6. Symptoms-tracking applications allow crowdsensing of health and location related data from individuals to track the spread and outbreaks of infectious diseases. During the COVID-19 pandemic, for the first time in history, these apps were widely adopted across the world to combat the pandemic. However, due to the sensitive nature of the data collected by these apps, serious privacy concerns were raised and apps were critiqued for their insufficient privacy safeguards. The Covid Nearby project was launched to develop a privacy-focused symptoms-tracking app and to understand the privacy preferences of users in health emergencies. In this work, we draw on the insights from the Covid Nearby users' data, and present an analysis of the significantly varying trends in users' privacy preferences with respect to demographics, attitude towards information sharing, and health concerns, e.g. after being possibly exposed to COVID-19. These results and insights can inform health informatics researchers and policy designers in developing more socially acceptable health apps in the future. 
    more » « less